traffic cone
Generalizing End-To-End Autonomous Driving In Real-World Environments Using Zero-Shot LLMs
Dong, Zeyu, Zhu, Yimin, Li, Yansong, Mahon, Kevin, Sun, Yu
Traditional autonomous driving methods adopt a modular design, decomposing tasks into sub-tasks. In contrast, end-to-end autonomous driving directly outputs actions from raw sensor data, avoiding error accumulation. However, training an end-to-end model requires a comprehensive dataset; otherwise, the model exhibits poor generalization capabilities. Recently, large language models (LLMs) have been applied to enhance the generalization capabilities of end-to-end driving models. Most studies explore LLMs in an open-loop manner, where the output actions are compared to those of experts without direct feedback from the real world, while others examine closed-loop results only in simulations. This paper proposes an efficient architecture that integrates multimodal LLMs into end-to-end driving models operating in closed-loop settings in real-world environments. In our architecture, the LLM periodically processes raw sensor data to generate high-level driving instructions, effectively guiding the end-to-end model, even at a slower rate than the raw sensor data. This architecture relaxes the trade-off between the latency and inference quality of the LLM. It also allows us to choose from a wide variety of LLMs to improve high-level driving instructions and minimize fine-tuning costs. Consequently, our architecture reduces data collection requirements because the LLMs do not directly output actions; we only need to train a simple imitation learning model to output actions. In our experiments, the training data for the end-to-end model in a real-world environment consists of only simple obstacle configurations with one traffic cone, while the test environment is more complex and contains multiple obstacles placed in various positions. Experiments show that the proposed architecture enhances the generalization capabilities of the end-to-end model even without fine-tuning the LLM.
- North America > United States > Washington > King County > Seattle (0.04)
- North America > United States > New York > Suffolk County > Stony Brook (0.04)
- North America > United States > Illinois > Cook County > Chicago (0.04)
- Transportation > Ground > Road (1.00)
- Information Technology > Robotics & Automation (1.00)
- Automobiles & Trucks (1.00)
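As a rough illustration of the dual-rate idea in the zero-shot LLM abstract above, the sketch below refreshes a high-level instruction only every few sensor frames while a fast policy acts on every frame. The names `slow_planner`, `fast_policy`, and `LLM_PERIOD`, along with the toy decision rules, are hypothetical stand-ins and not the paper's models or interfaces.

```python
from typing import List

LLM_PERIOD = 5  # hypothetical: the slow planner runs once every 5 sensor frames


def slow_planner(obstacle_offset: float) -> str:
    """Stand-in for the multimodal LLM (assumption, not the paper's model).

    obstacle_offset is the lateral offset of the nearest obstacle in metres
    (negative = obstacle on the left). Returns a high-level instruction.
    """
    if abs(obstacle_offset) > 1.0:
        return "go straight"
    return "steer right" if obstacle_offset < 0 else "steer left"


def fast_policy(obstacle_offset: float, instruction: str) -> float:
    """Stand-in for the imitation-learned policy (assumption).

    This toy version maps the instruction to a fixed steering command in
    [-1, 1]; a real policy would also condition on the raw sensor frame.
    """
    return {"go straight": 0.0, "steer left": -0.5, "steer right": 0.5}[instruction]


def run_closed_loop(frames: List[float], period: int = LLM_PERIOD) -> List[float]:
    """Run the fast policy on every frame, refreshing the instruction every `period` frames."""
    instruction = "go straight"
    actions = []
    for step, frame in enumerate(frames):
        if step % period == 0:
            # Slow path: only here is the (expensive) planner queried.
            instruction = slow_planner(frame)
        # Fast path: the policy reuses the most recent instruction.
        actions.append(fast_policy(frame, instruction))
    return actions
```

Between refreshes the policy keeps acting on the last instruction, which is why the planner can run slower than the sensor stream in this sketch.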
Tokenize the World into Object-level Knowledge to Address Long-tail Events in Autonomous Driving
Tian, Ran, Li, Boyi, Weng, Xinshuo, Chen, Yuxiao, Schmerling, Edward, Wang, Yue, Ivanovic, Boris, Pavone, Marco
The autonomous driving industry is increasingly adopting end-to-end learning from sensory inputs to minimize human biases in system design. Traditional end-to-end driving models, however, suffer from long-tail events due to rare or unseen inputs within their training distributions. To address this, we propose TOKEN, a novel Multi-Modal Large Language Model (MM-LLM) that tokenizes the world into object-level knowledge, enabling better utilization of LLM's reasoning capabilities to enhance autonomous vehicle planning in long-tail scenarios. TOKEN effectively alleviates data scarcity and inefficient tokenization by leveraging a traditional end-to-end driving model to produce condensed and semantically enriched representations of the scene, which are optimized for LLM planning compatibility through deliberate representation and reasoning alignment training stages. Our results demonstrate that TOKEN excels in grounding, reasoning, and planning capabilities, outperforming existing frameworks with a 27% reduction in trajectory L2 error and a 39% decrease in collision rates in long-tail scenarios. Additionally, our work highlights the importance of representation alignment and structured reasoning in sparking the common-sense reasoning capabilities of MM-LLMs for effective planning.
- Transportation > Ground > Road (0.87)
- Automobiles & Trucks (0.73)
- Information Technology > Robotics & Automation (0.62)
Searching Realistic-Looking Adversarial Objects For Autonomous Driving Systems
Numerous studies on adversarial attacks targeting self-driving policies fail to incorporate realistic-looking adversarial objects, limiting real-world applicability. Building upon prior research that facilitated the transition of adversarial objects from simulations to practical applications, this paper discusses a modified gradient-based texture optimization method to discover realistic-looking adversarial objects. While retaining the core architecture and techniques of the prior research, the proposed addition involves an entity termed the 'Judge'. This agent assesses the texture of a rendered object, assigning a probability score reflecting its realism. This score is integrated into the loss function to encourage the NeRF object renderer to concurrently learn realistic and adversarial textures. The paper analyzes four strategies for developing a robust 'Judge': 1) Leveraging cutting-edge vision-language models. 2) Fine-tuning open-sourced vision-language models. 3) Pretraining neurosymbolic systems. 4) Utilizing traditional image processing techniques. Our findings indicate that strategies 1) and 4) yield less reliable outcomes, pointing towards strategies 2) or 3) as more promising directions for future research.
- Transportation > Ground > Road (0.68)
- Information Technology > Robotics & Automation (0.53)
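The 'Judge' abstract above says only that a realism probability is integrated into the loss. One way this could look, purely as an assumption, is a negative log-likelihood penalty added to the adversarial objective; the function name `combined_loss` and the weight `realism_weight` are hypothetical and do not come from the paper.

```python
import math


def combined_loss(adversarial_loss: float,
                  realism_prob: float,
                  realism_weight: float = 1.0) -> float:
    """Add a realism penalty to an adversarial loss (illustrative assumption).

    realism_prob is the Judge's probability in (0, 1] that the rendered
    texture looks realistic. The penalty -log(p) is zero when p = 1 and
    grows as the texture looks less realistic.
    """
    eps = 1e-8  # guards against log(0)
    realism_penalty = -math.log(max(realism_prob, eps))
    return adversarial_loss + realism_weight * realism_penalty
```

Minimizing this sum pushes the renderer toward textures that are both adversarial and judged realistic; the relative weight would need tuning in practice.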
Racing With ROS 2: A Navigation System for an Autonomous Formula Student Race Car
Bradford, Alastair, van Breda, Grant, Fischer, Tobias
The advent of autonomous vehicle technologies has significantly impacted various sectors, including motorsport, where Formula Student and Formula: Society of Automotive Engineers introduced autonomous racing classes. These offer new challenges to aspiring engineers, including the team at QUT Motorsport, but also raise the entry barrier due to the complexity of high-speed navigation and control. This paper presents an open-source solution using the Robot Operating System 2, specifically its open-source navigation stack, to address these challenges in autonomous Formula Student race cars. We compare the off-the-shelf navigation libraries that make up this stack against traditional custom-made programs developed by QUT Motorsport to evaluate their applicability in autonomous racing scenarios and integrate them onto an autonomous race car. Our contributions include quantitative and qualitative comparisons of these packages against traditional navigation solutions, aiming to lower the entry barrier for autonomous racing. This paper also serves as a comprehensive tutorial for teams participating in similar racing disciplines and other autonomous mobile robot applications.
- Europe > Germany (0.05)
- Oceania > Australia > Queensland (0.04)
Drive Like a Human: Rethinking Autonomous Driving with Large Language Models
Fu, Daocheng, Li, Xin, Wen, Licheng, Dou, Min, Cai, Pinlong, Shi, Botian, Qiao, Yu
In this paper, we explore the potential of using a large language model (LLM) to understand the driving environment in a human-like manner and analyze its ability to reason, interpret, and memorize when facing complex scenarios. We argue that traditional optimization-based and modular autonomous driving (AD) systems face inherent performance limitations when dealing with long-tail corner cases. To address this problem, we propose that an ideal AD system should drive like a human, accumulating experience through continuous driving and using common sense to solve problems. To achieve this goal, we identify three key abilities necessary for an AD system: reasoning, interpretation, and memorization. We demonstrate the feasibility of employing an LLM in driving scenarios by building a closed-loop system to showcase its comprehension and environment-interaction abilities. Our extensive experiments show that the LLM exhibits the impressive ability to reason and solve long-tailed cases, providing valuable insights for the development of human-like autonomous driving.
The Brilliance of Disabling Self-Driving Cars With a Traffic Cone
This article is adapted from Oversharing, a newsletter about the sharing economy. Self-driving cars have met their match in the form of the humble traffic cone. If you're on TikTok, you may have seen what I'm talking about: a viral video of San Francisco activists disabling autonomous Cruise and Waymo vehicles by placing bright orange traffic cones on their hoods.
- Transportation > Ground > Road (1.00)
- Automobiles & Trucks (1.00)
- Information Technology > Robotics & Automation (0.89)
- Transportation > Passenger (0.73)
Improving Autonomous Vehicle Mapping and Navigation in Work Zones Using Crowdsourcing Vehicle Trajectories
Chen, Hanlin, Luo, Renyuan, Feng, Yiheng
Prevalent solutions for Connected and Autonomous Vehicle (CAV) mapping include high-definition maps (HD maps) and real-time Simultaneous Localization and Mapping (SLAM). Both methods rely only on the vehicle itself (onboard sensors or embedded maps) and cannot adapt well to temporarily changed drivable areas such as work zones. Navigating CAVs in such areas relies heavily on how the vehicle defines drivable areas based on perception information. Improving perception accuracy and ensuring the correct interpretation of perception results are both challenging for the vehicle in these situations. This paper presents a prototype that introduces crowdsourced trajectory information into the mapping process to enhance the CAV's understanding of the drivable area and traffic rules. A Gaussian Mixture Model (GMM) is applied to construct the temporarily changed drivable area and an occupancy grid map (OGM) based on crowdsourced trajectories. The proposed method is compared with SLAM without any human driving information. Our method adapts well to the downstream path planning and vehicle control modules, and the CAV did not violate driving rules, which a pure SLAM method did not achieve.
- North America > United States > Indiana > Tippecanoe County > West Lafayette (0.05)
- North America > United States > Indiana > Tippecanoe County > Lafayette (0.05)
- North America > United States > Arizona (0.04)
- Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
- Information Technology > Communications > Social Media > Crowdsourcing (0.95)
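To make the GMM step in the work-zone mapping abstract above more concrete, here is a minimal sketch that evaluates an already-fitted 2D Gaussian mixture over a grid and thresholds it into a drivable mask. It assumes isotropic components and skips the fitting step entirely; the `Component` layout, the function names, and the threshold are hypothetical and are not taken from the paper.

```python
import math
from typing import List, Tuple

# Hypothetical component layout: (weight, mean_x, mean_y, sigma), isotropic covariance.
Component = Tuple[float, float, float, float]


def gmm_density(x: float, y: float, components: List[Component]) -> float:
    """Evaluate a 2D isotropic Gaussian mixture density at point (x, y)."""
    total = 0.0
    for weight, mean_x, mean_y, sigma in components:
        sq_dist = (x - mean_x) ** 2 + (y - mean_y) ** 2
        norm = 2.0 * math.pi * sigma * sigma
        total += weight * math.exp(-sq_dist / (2.0 * sigma * sigma)) / norm
    return total


def drivable_mask(components: List[Component],
                  xs: List[float],
                  ys: List[float],
                  threshold: float) -> List[List[int]]:
    """Return a grid (rows indexed by ys, columns by xs) with 1 where the
    trajectory density reaches the threshold, i.e. where vehicles were
    observed to drive often enough, and 0 elsewhere."""
    return [[1 if gmm_density(x, y, components) >= threshold else 0 for x in xs]
            for y in ys]
```

In this sketch, cells that many crowdsourced trajectories pass through get high density and are marked drivable; in practice the mixture parameters would come from fitting the GMM to the collected trajectory points.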
SOTIF Entropy: Online SOTIF Risk Quantification and Mitigation for Autonomous Driving
Peng, Liang, Li, Boqi, Yu, Wenhao, Yang, Kai, Shao, Wenbo, Wang, Hong
Autonomous driving confronts great challenges in complex traffic scenarios, where the risk of Safety of the Intended Functionality (SOTIF) can be triggered by the dynamic operational environment and system insufficiencies. The SOTIF risk is reflected not only intuitively in the collision risk with objects outside the autonomous vehicles (AVs), but also inherently in the performance limitation risk of the implemented algorithms themselves. How to minimize the SOTIF risk for autonomous driving is currently a critical, difficult, and unresolved issue. Therefore, this paper proposes the "Self-Surveillance and Self-Adaption System" as a systematic approach to minimizing the SOTIF risk online, which aims to provide a systematic solution for monitoring, quantification, and mitigation of inherent and external risks. The core of this system is the risk monitoring of the implemented artificial intelligence algorithms within the AV. As a demonstration of the Self-Surveillance and Self-Adaption System, the risk monitoring of the perception algorithm, i.e., YOLOv5, is highlighted. Moreover, the inherent perception algorithm risk and external collision risk are jointly quantified via SOTIF entropy, which is then propagated downstream to the decision-making module and mitigated. Finally, several challenging scenarios are demonstrated, and Hardware-in-the-Loop experiments are conducted to verify the efficiency and effectiveness of the system. The results demonstrate that the Self-Surveillance and Self-Adaption System enables dependable online monitoring, quantification, and mitigation of SOTIF risk in real-time critical traffic environments.
- Asia > Middle East > Jordan (0.14)
- North America > United States > Michigan > Washtenaw County > Ann Arbor (0.14)
- Asia > China > Beijing > Beijing (0.04)
- Transportation > Ground > Road (1.00)
- Information Technology (1.00)
- Automobiles & Trucks (1.00)
- Government > Regional Government > North America Government > United States Government (0.46)
Active Learning at Scale -- Building a Robust Data Unification Framework
Nauto is a leading provider of advanced driver assistance systems that improve the safety of commercial fleets today and the autonomous vehicles of tomorrow. To that end, we process terabytes of driving data a month, collected by windshield-mounted devices from vehicles around the world. This data is used to continuously improve the models that power our vehicle safety stack, from the real-time predictive collision alerts deployed to our devices, to the safety analytics that run on the cloud. Beyond providing immediate safety value to the drivers, our features play the important role of shaping their own evolution. If we want to improve the vehicle detection powering Forward Collision Warning (FCW), the first place we will look is the false positives triggered by FCW.
All Tesla FSD Visualizations and What They Mean
Tesla has gradually added more visualizations to the car display, showing what the car can detect and respond to in its environment. Tesla initially showed just road markings and some vehicles, but then added more vehicle types, pedestrians, and traffic cones. However, with the release of FSD Beta version 9, Tesla has drastically increased the number of objects the car can visualize and interact with. The visualizations in the car aren't tied one-to-one with what the car is capable of detecting and using to make decisions. However, Tesla keeps visualizations and object detection closely coupled so that drivers have a good understanding of what the car can see.